Segmentation of Expository Texts by Hierarchical Agglomerative Clustering

Author

  • Yaakov Yaari
Abstract

We propose a method for segmentation of expository texts based on hierarchical agglomerative clustering. The method uses paragraphs as the basic segments for identifying hierarchical discourse structure in the text, applying lexical similarity between them as the proximity test. Linear segmentation can be induced from the identified structure through application of two simple rules. However, the hierarchy can also be used for intelligent exploration of the text. The proposed segmentation algorithm is evaluated against an accepted linear segmentation method and shows comparable results.
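The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and several choices are assumptions made here for illustration: lexical similarity is taken to be cosine similarity over raw word counts, only adjacent segments may merge, and the helper names `term_vector`, `cosine` and `agglomerate` are hypothetical. The abstract does not spell out the two rules that induce a linear segmentation, so the sketch substitutes a plain stand-in: stop merging once `n_segments` segments remain.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Word-count vector for one paragraph (assumed lexical representation)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def agglomerate(paragraphs, n_segments=1):
    """Repeatedly merge the most similar pair of ADJACENT segments.

    Returns (segments, merges). Each segment is a (first, last) pair of
    paragraph indices; merges records the hierarchy bottom-up as
    ((left_first, left_last), (right_first, right_last), similarity).
    Stopping at n_segments is an assumed stand-in for the paper's rules.
    """
    segments = [(i, i, term_vector(p)) for i, p in enumerate(paragraphs)]
    merges = []
    while len(segments) > max(n_segments, 1):
        sims = [
            cosine(segments[k][2], segments[k + 1][2])
            for k in range(len(segments) - 1)
        ]
        k = max(range(len(sims)), key=sims.__getitem__)  # first best pair
        left, right = segments[k], segments[k + 1]
        merges.append((left[:2], right[:2], sims[k]))
        # The merged segment spans both ranges and pools their word counts.
        segments[k:k + 2] = [(left[0], right[1], left[2] + right[2])]
    return [(first, last) for first, last, _ in segments], merges


if __name__ == "__main__":
    paragraphs = [
        "cats purr and cats sleep",
        "cats sleep all day",
        "stocks fell as markets closed",
        "markets and stocks rallied",
    ]
    segments, merges = agglomerate(paragraphs, n_segments=2)
    print(segments)
    for left, right, sim in merges:
        print(left, right, round(sim, 3))
```

Running the merge loop down to a single segment (`n_segments=1`) yields the full merge history, which is the hierarchy a reader could browse; stopping earlier gives a flat segmentation.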

Similar resources

Texplore - Exploring Expository Texts Via Hierarchical Representation

Exploring expository texts presents an interesting and important challenge. They are read routinely and extensively in the form of online newspapers, web-based articles, reports, technical and academic papers. We present a system, called Texplore, which assists readers in exploring the content of expository texts. The system provides two mechanisms for text exploration, an expandable outline th...

An Agglomerative Hierarchical Clustering Algorithm for Labelling Morphs

In this paper, we present an agglomerative hierarchical clustering algorithm for labelling morphs. The algorithm aims to capture allomorphs and homophonous morphemes for a deeper analysis of segmentation results of a morphological segmentation system. Most morphological segmentation systems focus only on segmentation rather than labelling morphs according to their roles in words, i.e. inflectio...

Color Image Segmentation Using Anisotropic Diffusion and Agglomerative Hierarchical Clustering

A new color image segmentation scheme is presented in this paper. The proposed algorithm consists of image simplification, region labeling and color clustering. The vector-valued diffusion process is performed in the perceptually uniform LUV color space. We present a discrete 3-D diffusion model for easy implementation. The statistical characteristics of each labeled region are employed to esti...

Agglomerative connectivity constrained clustering for image segmentation

We consider the problem of clustering under the constraint that data points in the same cluster are connected according to a pre-existed graph. This constraint can be efficiently addressed by an agglomerative clustering approach, which we exploit to construct a new fully automatic segmentation algorithm for color photographs. For image segmentation, if the pixel grid with eight neighbor connect...

Selection of Segment Similarity Measures for Hierarchical Picture Segmentation

The problem of defining appropriate segment similarity measures for picture segmentation is examined. In agglomerative hierarchical segmentation, two segments are compared and merged if found similar. The proposed Hierarchical Step-Wise Optimization (HSWO) algorithm finds and then merges the two most similar segments, on a step-by-step basis. By considering picture segmentation as a piece-wise ...

Journal:
  • CoRR

Volume: cmp-lg/9709015

Publication date: 1997